Evaluation Criteria for Automatic Essay Assessment Systems – There is much more to it than just the correlation

Authors

  • Tuomo Kakkonen
  • Erkki Sutinen
Abstract

Automatic essay grading systems are usually evaluated by computing the correlation between the grades assigned by the assessment system and those assigned by a set of human graders. We argue that this method alone is inadequate for evaluating state-of-the-art assessment systems, and define a set of evaluation criteria that covers all the relevant aspects of an essay assessment system.
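
As a rough illustration of why correlation alone can be an incomplete measure, the following minimal sketch compares a set of human grades with system grades that are consistently one point too high. The grade data and the function names `pearson` and `exact_agreement` are hypothetical and are not taken from the paper; exact agreement is used here only as one simple contrasting measure, not as one of the authors' criteria.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length grade lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two grades")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def exact_agreement(xs, ys):
    """Fraction of essays on which both graders gave the identical grade."""
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)


# Hypothetical grades on a 1-6 scale (illustrative data only).
human_grades = [1, 2, 3, 4, 5, 3]
# A hypothetical system that always grades one point higher than the humans.
system_grades = [g + 1 for g in human_grades]

r = pearson(human_grades, system_grades)
agreement = exact_agreement(human_grades, system_grades)
print(f"correlation = {r:.3f}, exact agreement = {agreement:.2f}")
```

In this constructed case the correlation is perfect even though the system never assigns the same grade as the human graders, because the correlation coefficient is unaffected by a constant offset.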


Related resources

A New Statistical Model for Evaluation Interactive Question Answering Systems Using Regression

The development of computer systems and the extensive use of information technology in everyday life have made quick access to information increasingly important. The growing volume of information makes it difficult to manage or control. Thus, some instruments need to be provided to use this information. The QA system is ...


Raters’ Perception and Expertise in Evaluating Second Language Compositions

The consideration of rater training is very important in construct validation of a writing test because it is through training that raters are adapted to the use of students’ writing ability instead of their own criteria for assessing compositions (Charney, 1984). However, although training has been discussed in the literature of writing assessment, there is little research regarding raters’ pe...


A Mixed Method Study of Interventionist DA: A Case of Introvert vs. Extrovert EFL Learners’ Academic Essay Writing

Today, a great number of assessment methods are practiced in educational systems. However, Dynamic Assessment (DA), as the modern assessment method with its emphasis on improvement and development of learning through joining teaching and assessment, is of paramount significance. Thus, one can call DA a major and revolutionizing factor in teaching and assessment. So far, some conducted ...


The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are at the core of Machine Translation (MT) engines, as these engines are developed through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...


High Stakes Require More Than Just Talk: What to Do About Corruption in Health Systems; Comment on “We Need to Talk About Corruption in Health Systems”

Reluctance to talk about corruption is an important barrier to action. Yet the stakes of not addressing corruption in the health sector are higher than ever. Corruption includes wrongdoing by individuals, but it is also a problem of weak institutions captured by political interests, and underfunded, unreliable administrative systems and healthcare delivery models. We ur...




Publication date: 2008